Approximation theory of transformer networks for sequence modeling
The transformer is a widely applied architecture in sequence modeling
applications, but the theoretical understanding of its working principles is
limited. In this work, we investigate the ability of transformers to
approximate sequential relationships. We first prove a universal approximation
theorem for the transformer hypothesis space. From its derivation, we identify
a novel notion of regularity under which we can prove an explicit approximation
rate estimate. This estimate reveals key structural properties of the
transformer and suggests the types of sequence relationships that the
transformer is adapted to approximating. In particular, it allows us to
concretely discuss the structural bias between the transformer and classical
sequence modeling methods, such as recurrent neural networks. Our findings are
supported by numerical experiments.
On Matching, and Even Rectifying, Dynamical Systems through Koopman Operator Eigenfunctions
Matching dynamical systems, through different forms of conjugacies and
equivalences, has long been a fundamental concept, and a powerful tool, in the
study and classification of nonlinear dynamic behavior (e.g. through normal
forms). In this paper we will argue that the use of the Koopman operator and
its spectrum is particularly well suited for this endeavor, both in theory and,
especially, in view of recent data-driven algorithm developments. We
believe, and document through illustrative examples, that this can nontrivially
extend the use and applicability of the Koopman spectral theoretical and
computational machinery beyond modeling and prediction, towards what can be
considered as a systematic discovery of "Cole-Hopf-type" transformations for
dynamics.
Comment: 34 pages, 10 figures